Unraveling the Impact of Randomization Techniques: A Case Study on Uber’s Tipping Experiment

university of michigan
statistical analysis
This post continues my notes on experimental design and analysis. Here I discuss randomization techniques, using the tipping experiment that Uber ran for its drivers a couple of years ago as a use case.



March 25, 2024

Randomization is a cornerstone of experimental design: it guards against selection bias and confounding variables. This post looks at three randomization techniques: access, timing, and encouragement randomization. Each helps ensure that differences between groups can be attributed to the intervention rather than to how subjects were selected. To make the ideas concrete, I apply each technique to the Uber tipping experiment, with Python illustrations of the implementation and a regression model for the analysis.

Access Randomization:

Access randomization assigns subjects at random to a treatment group and a control group. Only the treatment group gets the intervention under study, which gives a clear demarcation for comparative analysis.

Uber Tipping Experiment Application:

Imagine that Uber wants to assess the impact of a new tipping feature on driver satisfaction. With access randomization, drivers are randomly split into two cohorts: one gains access to the tipping functionality (treatment), while the other continues without it (control).

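import random

import numpy as np

def access_randomization(drivers):
    # Convert to a list so random.sample also accepts NumPy arrays
    drivers = list(drivers)
    treatment_group = random.sample(drivers, len(drivers) // 2)
    treated = set(treatment_group)  # a set gives fast membership checks
    control_group = [driver for driver in drivers if driver not in treated]
    return treatment_group, control_group

# Simulating driver IDs
drivers = np.arange(1, 101)

# Randomly assigning drivers to treatment and control groups:
# shuffle the IDs first, then split, so that each half is a random sample
shuffled_drivers = np.random.permutation(drivers)
treatment_group = shuffled_drivers[:50]  # 50 randomly chosen drivers
control_group = shuffled_drivers[50:]  # Remaining drivers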
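
To see why random access assignment supports a simple comparison, here is a minimal sketch on made-up data. The satisfaction scores, the assumed effect of 0.5, the seeds, and the helper name `assign_access` are all illustrative assumptions, not values or code from the Uber experiment. The sketch splits drivers at random and then compares mean outcomes between the two groups.

```python
import random
import statistics

def assign_access(driver_ids, seed=42):
    """Shuffle a copy of the IDs and split it in half: (treatment, control)."""
    rng = random.Random(seed)
    shuffled = list(driver_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

driver_ids = list(range(1, 101))
treatment_ids, control_ids = assign_access(driver_ids)
treated = set(treatment_ids)

# Simulated satisfaction scores (NOT Uber data): noise around a baseline of 7.0,
# plus an assumed lift of 0.5 for drivers who have the tipping feature.
ASSUMED_EFFECT = 0.5
noise = random.Random(0)
scores = {
    d: noise.gauss(7.0, 1.0) + (ASSUMED_EFFECT if d in treated else 0.0)
    for d in driver_ids
}

# Difference in mean outcome between the two randomized groups
diff_in_means = (
    statistics.mean(scores[d] for d in treatment_ids)
    - statistics.mean(scores[d] for d in control_ids)
)
print(f"Estimated effect: {diff_in_means:.2f} (assumed true effect: {ASSUMED_EFFECT})")
```

Because assignment is random, the difference in means is an unbiased estimate of the assumed effect, although with only 100 drivers any single run can land noticeably above or below it.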

Timing Randomization:

Timing randomization randomizes when each subject receives the treatment rather than whether they receive it. This method is particularly useful when all subjects will eventually receive the treatment, but the order of administration can be randomized.

Uber Tipping Experiment Application:

In applying this to our Uber case, let’s consider a phased rollout of the tipping feature. Drivers are randomly assigned to different phases, ensuring an unbiased evaluation of the feature’s impact over time.

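def timing_randomization(drivers, num_waves):
    # Shuffle first so wave membership is random, not based on driver order
    shuffled = np.random.permutation(drivers)
    # array_split returns exactly num_waves waves, even if sizes are uneven
    waves = np.array_split(shuffled, num_waves)
    return waves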
# Defining phases for feature rollout
phases = ['Phase 1', 'Phase 2', 'Phase 3']

# Assigning each driver to a phase independently at random
# (phase sizes will therefore be uneven)
driver_phases = np.random.choice(phases, size=len(drivers))

# Analyzing the distribution of drivers across phases
for phase in phases:
    print(f"{phase} Drivers:", np.sum(driver_phases == phase))

# Output from one run (counts vary from run to run):
Phase 1 Drivers: 31
Phase 2 Drivers: 38
Phase 3 Drivers: 31
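
A phased rollout is easier to analyze once it is written as an exposure schedule: for every driver and every period, does the driver have the feature yet? The sketch below is one way to build that schedule. It assumes 12 hypothetical drivers, three equally sized phases, and a pre-rollout baseline period; none of these numbers come from the Uber experiment.

```python
import numpy as np

rng = np.random.default_rng(7)
driver_ids = np.arange(1, 13)  # 12 hypothetical drivers
num_phases = 3

# Shuffle, then deal drivers into equally sized phases (0, 1, 2).
shuffled = rng.permutation(driver_ids)
phase_of = {int(d): i % num_phases for i, d in enumerate(shuffled)}

# Period 0 is a pre-rollout baseline; phase k switches on in period k + 1.
num_periods = num_phases + 1
start_period = np.array([phase_of[int(d)] + 1 for d in driver_ids])

# exposure[i, t] = 1 if driver i has the feature during period t, else 0.
exposure = (np.arange(num_periods)[None, :] >= start_period[:, None]).astype(int)

# Drivers with the feature in each period: 0, 4, 8, 12
print(exposure.sum(axis=0))
```

In any period before the last, drivers whose phase has not started yet act as a comparison group for those who already have the feature, and random phase assignment is what makes that comparison fair.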

Encouragement Randomization:

Encouragement randomization is employed when the treatment is universally accessible, but a nudge is given to a randomly selected subgroup to encourage participation. This technique is instrumental in discerning the effect of encouragement on treatment uptake.

Uber Tipping Experiment Application:

In this context, while the tipping feature is available to all drivers, a randomized subset receives motivational messages or incentives to encourage the use of this feature, aiding in the assessment of encouragement’s effectiveness.

def encouragement_randomization(drivers, encouragement_ratio):
    # Convert to a list so random.sample also accepts NumPy arrays
    drivers = list(drivers)
    num_encouraged = int(len(drivers) * encouragement_ratio)
    encouraged_group = random.sample(drivers, num_encouraged)
    encouraged = set(encouraged_group)  # a set gives fast membership checks
    not_encouraged_group = [driver for driver in drivers if driver not in encouraged]
    return encouraged_group, not_encouraged_group
# Defining encouragement proportion
encouragement_ratio = 0.3

# Randomly selecting drivers for encouragement
encouraged_drivers = np.random.choice(drivers, size=int(len(drivers) * encouragement_ratio), replace=False)

# Examining the encouraged group
print("Encouraged Drivers:", encouraged_drivers)

# Output from one run (the selected IDs vary from run to run):
Encouraged Drivers: [66 41 47 75  5 34 54 17 20 59 73 24 95 29 16 65 44 23 45 11 70 25 76 74
 80 63 85 50 38 72]
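
The effect of encouragement on uptake can be measured by comparing usage rates between the nudged and non-nudged groups. The following sketch simulates that comparison. The uptake probabilities (0.40 without a nudge, 0.60 with one), the sample of 1,000 drivers, and the variable names are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(11)
n_drivers = 1000
encouragement_ratio = 0.3

# Randomly pick exactly 30% of drivers to receive the nudge.
encouraged = np.zeros(n_drivers, dtype=bool)
nudged_idx = rng.choice(n_drivers, size=int(n_drivers * encouragement_ratio), replace=False)
encouraged[nudged_idx] = True

# Assumed uptake probabilities (illustrative only, not measured values).
P_UPTAKE_BASE = 0.40
P_UPTAKE_NUDGED = 0.60
uptake_prob = np.where(encouraged, P_UPTAKE_NUDGED, P_UPTAKE_BASE)
uses_feature = rng.random(n_drivers) < uptake_prob

# Uptake rate in each arm, and the difference attributable to the nudge
uptake_nudged = uses_feature[encouraged].mean()
uptake_base = uses_feature[~encouraged].mean()
print(f"Uptake with nudge: {uptake_nudged:.2f}, without: {uptake_base:.2f}")
print(f"Estimated effect of encouragement on uptake: {uptake_nudged - uptake_base:.2f}")
```

Every driver can use the feature in this design, so the only thing that differs systematically between the two arms is the nudge itself.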

Statistical Analysis with Regression:

A regression analysis can quantify how treatment assignment and encouragement relate to tipping behavior.

Regression Model Summary:

import statsmodels.api as sm

# Assuming 'df' is a DataFrame with 'tip_amount', 'treated', 'encouraged', and other relevant variables
X = sm.add_constant(df[['treated', 'encouraged']])
y = df['tip_amount']

# Fitting an OLS regression model
model = sm.OLS(y, X).fit()

# Displaying the model summary ('df' must be defined before running this cell)
print(model.summary())

This model estimates how being in the treatment group and receiving encouragement are associated with tip amounts. Other covariates can be added to X to control for additional variables as necessary.
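
Since no dataset accompanies this post, the sketch below generates one to show what the regression recovers. Everything in it is an assumption: the sample size, the independent coin-flip assignments, and the made-up coefficients (base tip 2.0, treatment effect 1.0, encouragement effect 0.5). To stay dependency-light it uses NumPy's least-squares solver in place of statsmodels; the point estimates match OLS, but no standard errors or summary table are produced.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000

# Independent random assignments (a simplifying assumption for this sketch).
treated = rng.integers(0, 2, size=n)
encouraged = rng.integers(0, 2, size=n)

# Assumed data-generating process; these coefficients are made up.
BASE_TIP, TREATED_EFFECT, ENCOURAGED_EFFECT = 2.0, 1.0, 0.5
tip_amount = (
    BASE_TIP
    + TREATED_EFFECT * treated
    + ENCOURAGED_EFFECT * encouraged
    + rng.normal(0.0, 1.0, size=n)  # unexplained variation in tips
)

# Design matrix with the intercept column first, followed by the two indicators.
X = np.column_stack([np.ones(n), treated, encouraged])
coef, *_ = np.linalg.lstsq(X, tip_amount, rcond=None)

for name, value in zip(["const", "treated", "encouraged"], coef):
    print(f"{name}: {value:.2f}")
```

The fitted coefficients should land close to the assumed values, which is a quick sanity check that the model specification matches the randomization design.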

